RAW SEQUENCE LISTING DATE: 12/05/2001

PATENT APPLICATION: US/09/851,138A TIME: 11:39:07

Input Set : N:\Crf3\RULE60\09851138A.txt
Output Set: N:\CRF3\12052001\I851138A.raw 

SEQUENCE LISTING 
4 (1) GENERAL INFORMATION: 

6 (i) APPLICANT: MAERTENS, GEERT 

7 STUYVER, LIEVEN 

9 (ii) TITLE OF INVENTION: NEW SEQUENCES OF HEPATITIS C VIRUS GENOTYPES 

10 AND THEIR USE AS PROPHYLACTIC, THERAPEUTIC AND DIAGNOSTIC 

11 AGENTS 
13 (iii) NUMBER OF SEQUENCES: 207 

15 (iv) CORRESPONDENCE ADDRESS: 

16 (A) ADDRESSEE: ARNOLD, WHITE & DURKEE 

17 (B) STREET: P.O. BOX 4433 

18 (C) CITY: HOUSTON 

19 (D) STATE: TEXAS 

20 (E) COUNTRY: USA 

21 (F) ZIP: 77210-4433 

23 (v) COMPUTER READABLE FORM:

24 (A) MEDIUM TYPE: Floppy disk 

25 (B) COMPUTER: IBM PC compatible 

26 (C) OPERATING SYSTEM: PC-DOS/MS-DOS

27 (D) SOFTWARE: Microsoft Word 6.0 / ASCII text output 
29 (vi) CURRENT APPLICATION DATA: 

C--> 30 (A) APPLICATION NUMBER: US/09/851,138A

C--> 31 (B) FILING DATE: 09-May-2001 

C--> 41 (vii) PRIOR APPLICATION DATA: 

34 (A) APPLICATION NUMBER: 08/836,075 

35 (B) FILING DATE: 1998-01-02 

38 (A) APPLICATION NUMBER: EP 94870166.9 

39 (B) FILING DATE: 21 Oct 1994 

42 (A) APPLICATION NUMBER: EP 95870076.7 

43 (B) FILING DATE: 28 Jun 1995 
C--> 45 (viii) ATTORNEY/AGENT INFORMATION:

46 (A) NAME: KAMMERER, PATRICIA A.

47 (B) REGISTRATION NUMBER: 29,775 

48 (C) REFERENCE/DOCKET NUMBER: INNS: 004 
50 (2) INFORMATION FOR SEQ ID NO: 1:

52 (i) SEQUENCE CHARACTERISTICS: 

53 (A) LENGTH: 327 base pairs 

54 (B) TYPE: nucleic acid 

55 (C) STRANDEDNESS: single 

56 (D) TOPOLOGY: linear 
58 (ii) MOLECULE TYPE: cDNA 
60 (iii) HYPOTHETICAL: NO

C--> 62 (iv) ANTI-SENSE: NO 

67 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

69 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA ACACCAACCG CCGCCCTCAK 60 
71 GGSGTNNNNN NNCCGGGTGG CGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120

73 GGCCCCAGGN NGGGTGTGCG CGCGACTAGG AAGACTTCCG AGCGGTCACA ACCTCGTGGC 180 
75 AGGCGACAGC CTATCCCCAA GGCTCGYCGG YCCGAGGGCA GGTCCTGGGC TCAGCCCGGG 240

77 TATCCTTGGC CCCTCTATGG CAATGAGGGC TGCGGGTGGG CGGGNTGGCT CCTGTCCCCC 300 
79 CGCGGCTCTC GGCCCAATTG GGGCCCC 327 
81 (2) INFORMATION FOR SEQ ID NO: 2:

83 (i) SEQUENCE CHARACTERISTICS: 

84 (A) LENGTH: 109 amino acids 

85 (B) TYPE: amino acid 

86 (D) TOPOLOGY: linear 
88 (ii) MOLECULE TYPE: peptide 

92 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

94 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn

95 1 5 10 15
W--> 97 Arg Arg Pro Xaa Xaa Xaa Xaa Xaa Pro Gly Gly Gly Gln Ile Val Gly

98 20 25 30 

W--> 100 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Xaa Gly Val Arg Ala 

101 35 40 45 

103 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro

104 50 55 60 

W--> 106 Ile Pro Lys Ala Xaa Arg Xaa Glu Gly Arg Ser Trp Ala Gln Pro Gly

107 65 70 75 80 

W--> 109 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Xaa Trp 

110 85 90 95 

112 Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro 

113 100 105 
115 (2) INFORMATION FOR SEQ ID NO: 3:

117 (i) SEQUENCE CHARACTERISTICS: 

118 (A) LENGTH: 447 base pairs 

119 (B) TYPE: nucleic acid 

120 (C) STRANDEDNESS: single

121 (D) TOPOLOGY: linear 
123 (ii) MOLECULE TYPE: cDNA 
125 (iii) HYPOTHETICAL: NO 

C--> 127 (iv) ANTI-SENSE: NO 

131 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

133 GACGGCGTGA ACTATGCAAC AGGGAACTTG CCCGGTTGCT CTTTCTCTAT CTTCCTCTTG 60

135 GCTTTGCTGT CCTGCTTGAC GGTTCCAACK ACCGCTCACG AGGTGCGCAA CGCATCCGGG 120 
137 GTGTATCATG TCACCAACGA CTGTTCCAAC TCGAGCATCA TCTATGAGAT GGACGGTATG 180 
139 ATCATGCACT ACCCAGGGTG CGTGCCCTGC GTTCGGGAGG ATAACCATCT CCGCTGCTGG 240

141 ATGGCGCTCA CCCCCACGCT TGCGGTCAAA AAYGCTAGTG TCCCCACTRC GGCAATCCGA 300 
143 CGTCACGTCG ACTTGCTTGT TGGGGGNNCC ACGTTCTGTT CCGCTATGTA CGTGGGRGAC 360
145 CTTTGCGGGT CTGTCTTCCT CGCTGGCCAG CTATTCACCT TTTCACCCCG CATGCACCAT 420
147 ACAACGCAGG AGTGCAACTG CTCAATC 447

149 (2) INFORMATION FOR SEQ ID NO: 4:

151 (i) SEQUENCE CHARACTERISTICS: 

152 (A) LENGTH: 149 amino acids 

153 (B) TYPE: amino acid 

154 (D) TOPOLOGY: linear 
156 (ii) MOLECULE TYPE: peptide 

160 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

192 (2) INFORMATION FOR SEQ ID NO: 5:

194 (i) SEQUENCE CHARACTERISTICS: 

195 (A) LENGTH: 327 base pairs 

196 (B) TYPE: nucleic acid 

197 (C) STRANDEDNESS: single 

198 (D) TOPOLOGY: linear 
200 (ii) MOLECULE TYPE: cDNA 
202 (iii) HYPOTHETICAL: NO 

C--> 204 (iv) ANTI-SENSE: NO 

208 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

210 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA ACACCAACCG CCGCCCACAG 60 
212 GACGTCAAGN TCCCGGGTGG TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120 
214 GGCCCCAGGT TGGGTGTGCG CGCGACCAGG AAGACTTCCG AGCGGTCGCA GCCTCGTGAC 180 
216 AGGCGACAGC CTATTCCTAA GGCTCGCCAG TCCGATGGCA GNNCCTGGGC TCAGCCAGGG 240

218 CATCCCTGGC CCCTCTATGG CAATGAGGGC TGCGGATGGG CGGGATGGCT CCTGTCCCCC 300 
220 CGCGGCTCTC GGCCCAGTTG GGGCCCC 327 
222 (2) INFORMATION FOR SEQ ID NO: 6:

224 (i) SEQUENCE CHARACTERISTICS: 

225 (A) LENGTH: 109 amino acids 

226 (B) TYPE: amino acid 

227 (D) TOPOLOGY: linear 
229 (ii) MOLECULE TYPE: peptide 

233 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

235 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn

236 1 5 10 15
W--> 238 Arg Arg Pro Gln Asp Val Lys Xaa Pro Gly Gly Gly Gln Ile Val Gly

239 20 25 30 

241 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 

242 35 40 45 
244 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Asp Arg Arg Gln Pro

245 50 55 60 

W--> 247 Ile Pro Lys Ala Arg Gln Ser Asp Gly Xaa Xaa Trp Ala Gln Pro Gly

248 65 70 75 80 

250 His Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 

251 85 90 95 
253 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro

254 100 105 

256 (2) INFORMATION FOR SEQ ID NO: 7:

258 (i) SEQUENCE CHARACTERISTICS: 

259 (A) LENGTH: 447 base pairs 

260 (B) TYPE: nucleic acid 

261 (C) STRANDEDNESS: single

262 (D) TOPOLOGY: linear 
264 (ii) MOLECULE TYPE: cDNA 
266 (iii) HYPOTHETICAL: NO 

C--> 268 (iv) ANTI-SENSE: NO 

272 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

274 GACGGCGTGA ACTATGCAAC AGGGAATTTG CCTGGTTGCT CTTTCTCTAT CTTCCTCTTA 60 
276 GCTTTTCTGT CCTGCTTGAC GGTTCCAACT ACCGCTCATG AGGTGCGCAA CGCATCCGGG 120 
278 GTATATCATC TCACCAATGA CTGTTCCAAC TCGAGCATCA TCTATGAGAT GAGTGGTATG 180 
280 ATCTTGCACG CCCCAGGGTG TGTGCCCTGC GTTCGGGAGA ACAACTCTTC TCGTTGCTGG 240
282 ATGCCRCTCA CCCCCACGCT TGCGGTCAAA GACGCTAATG TCCCTACTGC GGCAATCCGA 300
284 CGCCATGTCG ACTTGCTGGT TGGGACAGCC GCGTTTCGTT CCGCTATGTA CGTGGGGGAC 360
286 CTCTGCGGAT CCGTCTTCCT TGTCGGCCAG CTATTCACCT TTTCACCCCG CTTGTACCAT 420
288 ACAACACAGG AGTGCAACTG CTCAATC 447
290 (2) INFORMATION FOR SEQ ID NO: 8:

292 (i) SEQUENCE CHARACTERISTICS:

293 (A) LENGTH: 149 amino acids 

294 (B) TYPE: amino acid 

295 (D) TOPOLOGY: linear 
297 (ii) MOLECULE TYPE: peptide 

301 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

303 Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser 

304 1 5 10 15 

306 Ile Phe Leu Leu Ala Phe Leu Ser Cys Leu Thr Val Pro Thr Thr Ala

307 20 25 30 

309 His Glu Val Arg Asn Ala Ser Gly Val Tyr His Leu Thr Asn Asp Cys 

310 35 40 45 

312 Ser Asn Ser Ser Ile Ile Tyr Glu Met Ser Gly Met Ile Leu His Ala

313 50 55 60 

315 Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp 

316 65 70 75 80 
W--> 318 Met Xaa Leu Thr Pro Thr Leu Ala Val Lys Asp Ala Asn Val Pro Thr 

319 85 90 95 

321 Ala Ala Ile Arg Arg His Val Asp Leu Leu Val Gly Thr Ala Ala Phe

322 100 105 110 

324 Arg Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val 

325 115 120 125 
327 Gly Gln Leu Phe Thr Phe Ser Pro Arg Leu Tyr His Thr Thr Gln Glu

328 130 135 140

330 Cys Asn Cys Ser Ile

331 145 

333 (2) INFORMATION FOR SEQ ID NO: 9:

335 (i) SEQUENCE CHARACTERISTICS: 

336 (A) LENGTH: 223 base pairs 

337 (B) TYPE: nucleic acid 

338 (C) STRANDEDNESS: single

339 (D) TOPOLOGY: linear 
341 (ii) MOLECULE TYPE: cDNA 
343 (iii) HYPOTHETICAL: NO

C--> 345 (iv) ANTI-SENSE: NO 

349 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

351 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAAAGAA ACACCAACCG CCGCCCACAG 60 

353 GACGTCAAGT TCCCGGGCGG TGGCCAGATC GTTGGTGGAG TCTACGTGCT ACCGCGCAGG 120 

355 GGCCCTAGAT TGGGTGTGCG CGCAGCGCGG AAGACTTCGG AGCGGTCGCA ACCTCGTGGG 180 

357 AGGCGCCAAC CTATTCCCAA GGAGCGCCGA CCCGAGGGCA GGT 223 



359 (2) INFORMATION FOR SEQ ID NO: 10: 

361 (i) SEQUENCE CHARACTERISTICS: 

362 (A) LENGTH: 74 amino acids 

363 (B) TYPE: amino acid 

364 (D) TOPOLOGY: linear 
366 (ii) MOLECULE TYPE: peptide 

370 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

372 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn

373 1 5 10 15 

375 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly

376 20 25 30 

378 Gly Val Tyr Val Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 

379 35 40 45 

381 Ala Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro

382 50 55 60 

384 Ile Pro Lys Glu Arg Arg Pro Glu Gly Arg

385 65 70 

387 (2) INFORMATION FOR SEQ ID NO: 11: 
389 (i) SEQUENCE CHARACTERISTICS:

390 (A) LENGTH: 957 base pairs 

391 (B) TYPE: nucleic acid 

392 (C) STRANDEDNESS: single 

393 (D) TOPOLOGY: linear 
395 (ii) MOLECULE TYPE: cDNA 
397 (iii) HYPOTHETICAL: NO 

C--> 399 (iv) ANTI-SENSE: NO 

403 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

405 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGCA ACACCAACCG CCGCCCACAG 60

407 GACGTTAAAT TCCCGGGTGG GGGGCAGATC GTGGGTGGAG TTTACTTGTT GCCGCGCAGG 120

409 GGCCCCAGGT TGGGTGTGCG CGCGACGAGG AAGACTTCCG AGCGGTCGCA ACCTCGCGGA 180 

411 AGGCGACAGC CTATCCCCAA GGCTCGCCGA CCCGAGGGCA GGTCCTGGGC TCAGCCTGGG 240
VERIFICATION SUMMARY DATE: 12/05/2001

PATENT APPLICATION: US/09/851,138A TIME: 11:39:08

Input Set : N:\Crf3\RULE60\09851138A.txt
Output Set: N:\CRF3\12052001\I851138A.raw 

L:30 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER:] 

L:31 M:220 C: Keyword misspelled or invalid format, [(B) FILING DATE:] 

L:41 M:220 C: Keyword misspelled or invalid format, [(vii) PRIOR APPLICATION DATA:] 

L:45 M:220 C: Keyword misspelled or invalid format, [(viii) ATTORNEY/AGENT INFORMATION:] 

L:62 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:97 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:2

L:1052 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:26

L:1073 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:]

L:1119 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:28 

L:1140 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:]

L:1196 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:30

L:1217 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 